REASONS FOR REVIEW

The examiner has finally rejected claims 1-26, which are all of the claims in the application. 
Applicant hereby requests review of the final rejection prior to filing an appeal brief for the reasons set 
forth below. 

BACKGROUND 

For active memory systems to be effective, the organization of data in the array of processing 
elements (PEs) is an important consideration. Hence, the provision of an efficient mechanism for moving 
data from one PE to another is an important consideration in the design of the PE array. [0006] In the
past, several different methods of connecting PEs have been used in a variety of geometric arrangements 
including hypercubes, butterfly networks, one-dimensional strings/rings and two-dimensional meshes. In 
a two-dimensional mesh, the PEs are arranged in rows and columns, with each PE being connected to its 
four neighboring PEs in the rows above and below and columns to either side which are sometimes 
referred to as north, south, east and west connections. [0007] 

Disclosed in G.B. Patent Application Serial No. GB02215 630, entitled Control of Processing 
Elements in Parallel Processors, filed Sep. 17, 2002, is an arrangement in which a column select line and
a row select line can be used to identify processing elements which are active, e.g., capable of
transmitting or receiving data. The ability to use a row select signal and a column select signal to identify 
active PEs provides a substantial advantage over the art in that it enables data to be moved through the 
array of PEs in a nonuniform manner. However, the need still exists for enabling PEs within the array to
work independently of their neighboring PEs even though each PE within the array has received the same
instruction. [0007] 

The present invention satisfies that need through a control scheme in which each PE maintains a 
count. The starting value (or ending value) of the count need not be the same amongst the processing 
elements and preferably is related to the PEs' location thereby making each count maintained within each 
processing element responsive to that processing element's location. Data is received by each processing 
element as a result of the array of PEs executing a global command, e.g., shift left, but data is selected as 
output data as a result of the local count. In the words of claim 1, a method of controlling a plurality of
processing elements is comprised of: 

issuing a command to a plurality of processing elements 
arranged in an array; 

maintaining a count in each of a plurality of processing elements, 
each count being responsive to a processing element's location in said 
array; 

receiving data in each of said plurality of processing elements 
from processing elements connected thereto in response to the execution 
of said command; 

selecting from among the received data, where each of the 
received data is a candidate for selection, one of the received data for 
output in response to that processing element's count; and 

saving said selected data. 

The cited prior art does not disclose or suggest such a control scheme. 

35 U.S.C. § 103 Rejections
In paragraph 2 of the Office action, claims 1-2, 5-11, 15-16, and 19-26 stand rejected
under 35 U.S.C. § 103(a) as being unpatentable over Taylor (U.S. Patent No. 4,992,933) in view
of Barker (U.S. Patent No. 5,963,746).

The operation of the present invention is described in the following paragraphs from the
published application:

[0066] In operation, an input matrix of data is placed on the shift network, and 
moved around by using a combination of north, south, east and west shifts. In 
addition, the column select register 59 and row select register 61 may be used to 
determine which of the PEs is active. The exact combination of active PEs, 
instructions, and direction in which the instruction (shift) is performed will 
depend upon the particular array manipulation required. As the instructions are 
executed and the shifting proceeds, each PE will be presented with different array 
values. For example, if a wrap shift is performed a number of times equal to the
number of PEs in a row, each PE in the row will see every value held by all of
the other PEs in the row.

[0067] A PE can conditionally select any of the values it sees as its final output 
value by conditionally loading that value, which is representative of an output 
result matrix. However, only one value, the desired result, is loaded. (emphasis
added.)

There is no disclosure in Taylor of selecting from among the received data, where each of
the received data is a candidate for selection, because Taylor uses a very different control scheme.
In Taylor, the data arrives at the correct location at the end of the execution of the command. As
discussed in the example in Taylor in column 9, beginning at line 36:

[E]xactly M steps along the path leads to the correct processing element 
for the mapping. The North West quadrant of one possible way of setting 
out the set of loops for a 32 by 32 processor array is illustrated in FIG. 6. 
The remaining quadrants can be inferred by rotational symmetry. 

It will be noticed that some loops are shorter than others and some have a 
clockwise and some an anti-clockwise direction of shift as indicated by 
the arrows. However, the common factor for each of the loops is that a 
bit which is shifted 33 times along the loop on which it is located will 
end up in the corresponding position in the adjacent quadrant. In other 
words, in 33 steps, the whole array is rotated by 90 degrees. (emphasis
added.)

As is apparent from the foregoing quotation, in Taylor, data received at steps 
M-1, M-2, M-3, etc. are not candidates for selection, and no individual count is necessary 
for each processing element. 

In contrast, in the invention of claim 1, let's assume that M equals three and that counters 
in PEs A, B, C, and D are set to values as follows: 

Counter in PE A, M=3,
Counter in PE B, M-1=2,
Counter in PE C, M-2=1, and
Counter in PE D, M-3=0.

The counter in each PE decrements the count by one each time a command is executed. 
PE D selects its original data as the output value because its counter value already equals zero. 
PE C selects the data that it receives after a shift command is executed once, because after one 
execution of the shift command, PE C's counter equals zero. In a similar manner, PE B selects 
the data that it receives after the shift command is executed twice and PE A selects the data that it 
receives after the shift command is executed three times. In that manner, all the data that a PE 
receives is a candidate for selection. In contrast, all the PEs of Taylor select the data
received at the end of M steps. Data received at steps M-3, M-2, and M-1 are not candidates for
selection.
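
The counter example above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the implementation described in the application: the function name run_shifts, the use of a wrap shift in which each PE receives the value held by its right-hand neighbor, and the data values "a" through "d" are all hypothetical choices made for illustration.

```python
def run_shifts(initial, counts):
    """Simulate one row of PEs executing a global wrap-shift command.

    initial: value each PE holds before any shift (hypothetical data).
    counts:  starting count of each PE (assumed to reflect its location).
    Returns, for each PE, a tuple (step_at_which_selected, selected_value).
    """
    n = len(initial)
    current = list(initial)
    remaining = list(counts)
    saved = [None] * n

    # A PE whose count already equals zero selects its original data.
    for i in range(n):
        if remaining[i] == 0:
            saved[i] = (0, current[i])

    for step in range(1, max(remaining, default=0) + 1):
        # Global command (assumed direction): every PE receives the value
        # held by its right-hand neighbor, wrapping around at the row end.
        current = [current[(i + 1) % n] for i in range(n)]
        for i in range(n):
            if remaining[i] > 0:
                remaining[i] -= 1  # local count decrements once per command
                if remaining[i] == 0:
                    saved[i] = (step, current[i])  # select and save
    return saved


# Counts for PEs A, B, C, D as in the example above (M = 3).
print(run_shifts(["a", "b", "c", "d"], [3, 2, 1, 0]))
# Uniform counts: every PE selects only after the final step.
print(run_shifts(["a", "b", "c", "d"], [3, 3, 3, 3]))
```

In the first call each PE selects at a different step (3, 2, 1, and 0 respectively), so the data received at every step is a candidate for selection. The second call, with the same count in every PE, is offered only as a rough analogue of selection at the end of M steps.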

The examiner's citation of the example in column 12 of Taylor serves to illustrate how 
different Taylor is from the claimed invention. Column 12, lines 16-48, of Taylor disclose the 
following: 

The third algorithm, which performs a rotation by 180 degrees, is
illustrated in FIG. 11. As will be apparent on studying this Figure, and in
particular the two data paths represented by the heavy line 116 and the
dashed line 118, the processing elements are programmed to decode a
global shift instruction in two different ways in each quadrant making a
total of eight ways in all. These are set out in the following table.

It can be seen from the data paths 116 and 118 that a data bit can be
rotated by 180 degrees within the 8 by 8 processor array (e.g. from
element 114 to element 106) in eight shifts or steps (i.e. along path 118).
As each processing element handles two bits simultaneously, the average
number of steps over 180 degree rotation is only four. This algorithm,
like the others shown in FIGS. 9 and 10, can easily be generalized to an n 
by n array where n is even (e.g. a 32 by 32 array). 

This excerpt from Taylor highlights two points: common global instructions can be
interpreted in different ways at the local (PE) level, and the example shown in FIG. 11 of Taylor is
another example of data being selected at the end of the execution of the command(s). The cited
portions of Taylor do not teach or suggest maintaining a local count, i.e., a count in each PE, and 
selecting from among the received data, where each of the received data is a candidate for 
selection, one of the received data for output in response to that processing element's count. 

The addition of Barker does not supply the missing teachings. Even if Barker does 
provide a motivation to keep a count in individual processing elements, the control scheme of 
Taylor does not select from among the received data, where each of the received data is a
candidate for selection, one of the received data for output in response to that processing
element's count.

The limitations of claim 1 are found in all the other independent claims such that the 
rejection of independent claims 5, 8, 16, 20, 23, and 26 under 35 U.S.C. § 103 based on Taylor in 
view of Barker should be withdrawn. ^ 



Respectfully submitted. 




Edward L. Pencoske
Reg. No. 29,688
Jones Day
One Mellon Center
500 Grant Street, Suite 4500
Pittsburgh, PA, USA, 15219
(412) 394-9531
(412) 394-7959 (Fax)
Attorneys for Applicant



^ At this time, applicant has not addressed the rejection of the dependent claims but reserves the right to do
so should it become necessary to file an appeal brief.